Chapter 2 


AN ISI RESEARCH FRAMEWORK: 
INFORMATION SHARING AND DATA 
MINING 


Chapter Overview 


To address the data and technical challenges facing ISI, we present a 
research framework with a primary focus on KDD (Knowledge Discovery 
from Databases) technologies. The framework is discussed in the context of 
crime types and security implications. Selected data mining techniques, 
including information sharing and collaboration, association mining, 
classification and clustering, text mining, spatial and temporal mining, and 
criminal network analysis, are believed to be critical to criminal and 
intelligence analyses and investigations. Beyond the technical 
discussion, the chapter also covers caveats for data mining and important 
civil liberties considerations. 

2.1 Introduction 


Crime is an act or the commission of an act that is forbidden, or the 
omission of a duty that is commanded by a public law and that makes the 
offender liable to punishment by that law. The greater the threat a crime 
type poses to public safety, the more likely it is to be of national security 
concern. Some crimes, such as traffic violations, theft, and homicide, fall 
mainly within the jurisdiction of local law enforcement agencies. Other 
crimes need to be dealt with by both local law enforcement and national 
security authorities. Identity theft and fraud, for instance, are relevant at both 
the local and national levels: criminals may escape arrest by using false 
identities, and drug smugglers may enter the United States using 
counterfeit passports or visas. Organized crimes, such as terrorism and 
narcotics trafficking, are often diffuse geographically, resulting in common 
security concerns across cities, states, and countries. Cybercrimes can pose 
threats to public safety across multiple jurisdictional areas due to the 
widespread nature of computer networks. 

Table 2-1 summarizes the different types of crimes sorted by the degree 
of their respective public influence (Chen et al., 2004a). International and 
domestic terrorism, in particular, often involves multiple crime types (e.g., 
identity theft, money laundering, arson and bombing, organized and violent 
activities, and cyber-terrorism) and causes great damage. 


2.2 An ISI Research Framework 


We believe that KDD techniques can play a central role in improving 
counter-terrorism and crime-fighting capabilities of intelligence, security, 
and law enforcement agencies by reducing cognitive and information 
overload. Knowledge discovery refers to the non-trivial extraction of implicit, 
previously unknown, and potentially useful knowledge from data. 
Knowledge discovery techniques promise easy, convenient, and practical 
exploration of very large collections of data for organizations and users, and 
have been applied in marketing, finance, manufacturing, biology, and many 
other domains (e.g., predicting consumer behavior, detecting credit card 
fraud, or clustering genes that have similar biological functions) (Fayyad 
and Uthurusamy, 2002). Traditional knowledge discovery techniques 
include association rule mining, classification and prediction, cluster 
analysis, and outlier analysis (Han and Kamber, 2001). As natural language 
processing (NLP) research advances, text mining approaches that 
automatically extract, summarize, categorize, and translate text documents 
have also been widely used (Chen, 2001; Trybula, 1999). 
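
To make one of the traditional techniques named above concrete, the following is a minimal sketch of association rule mining restricted to rules between pairs of items. It is an illustration only, not a method taken from this chapter: the incident records, the tag names, the function name pair_rules, and the two thresholds are all hypothetical. Support is the fraction of records containing both items; confidence is the fraction of records containing the antecedent that also contain the consequent.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: each set holds the crime-type tags of one incident.
incidents = [
    {"identity_theft", "fraud"},
    {"identity_theft", "fraud", "money_laundering"},
    {"narcotics", "money_laundering"},
    {"identity_theft", "fraud"},
    {"narcotics"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return (antecedent, consequent, support, confidence) for one-item rules.

    The thresholds are illustrative assumptions, not recommended values.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        item_counts.update(t)
        # Sorting gives each unordered pair a single canonical key.
        pair_counts.update(combinations(sorted(t), 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


rules = pair_rules(incidents)
for x, y, s, c in rules:
    print(f"{x} -> {y}: support={s:.2f}, confidence={c:.2f}")
```

On this toy data only the pair (fraud, identity_theft) clears the support threshold, so the sketch reports a rule in each direction between those two tags. Full algorithms such as Apriori extend the same counting idea to itemsets of any size and prune the candidates they generate.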


